Data harmonization and federated analysis of population-based studies: the BioSHaRE project

Authors

  • Dany Doiron
  • Paul Burton
  • Yannick Marcon
  • Amadou Gaye
  • Bruce H R Wolffenbuttel
  • Markus Perola
  • Ronald P Stolk
  • Luisa Foco
  • Cosetta Minelli
  • Melanie Waldenberger
  • Rolf Holle
  • Kirsti Kvaløy
  • Hans L Hillege
  • Anne-Marie Tassé
  • Vincent Ferretti
  • Isabel Fortier
Abstract

BACKGROUND Individual-level data pooling of large population-based studies across research centres in international research projects faces many hurdles. The BioSHaRE (Biobank Standardisation and Harmonisation for Research Excellence in the European Union) project aims to address these issues by building a collaborative group of investigators and developing tools for data harmonization, database integration and federated data analyses.

METHODS Eight population-based studies in six European countries were recruited to participate in the BioSHaRE project. Through workshops, teleconferences and electronic communications, participating investigators identified a set of 96 variables targeted for harmonization to answer research questions of interest. Harmonization potential was assessed using each study's questionnaires, standard operating procedures, and data dictionaries. Whenever harmonization was deemed possible, processing algorithms were developed and implemented in an open-source software infrastructure to transform study-specific data into the target (i.e. harmonized) format. Harmonized datasets, located on servers in each research centre across Europe, were interconnected through a federated database system to perform statistical analysis.

RESULTS Retrospective harmonization led to the generation of common format variables for 73% of matches considered (96 targeted variables across 8 studies). Authenticated investigators can now use the DataSHIELD method to perform complex statistical analyses of harmonized datasets stored on distributed servers without actually sharing individual-level data.

CONCLUSION New Internet-based networking technologies and database management systems are providing the means to support collaborative, multi-center research in an efficient and secure manner. The results from this pilot project show that, given a strong collaborative relationship between participating studies, it is possible to seamlessly co-analyse internationally harmonized research databases while allowing each study to retain full control over individual-level data. We encourage additional collaborative research networks in epidemiology, public health, and the social sciences to make use of the open-source tools presented herein.
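
The two steps described in the abstract, transforming study-specific data into a harmonized format and then analysing it without moving individual-level records, can be illustrated with a minimal Python sketch. This is not the BioSHaRE software or the DataSHIELD interface (DataSHIELD is R-based); every name here (`StudyServer`, `study_a_height`, `height_m`, `hgt_cm`, `min_count`) and all data values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Record = Dict[str, object]


# Hypothetical processing algorithms: each maps one study's local variable
# onto the same harmonized target variable (height in centimetres).
def study_a_height(rec: Record) -> Optional[float]:
    value = rec.get("height_m")  # assumed: study A records height in metres
    return None if value is None else float(value) * 100.0


def study_b_height(rec: Record) -> Optional[float]:
    value = rec.get("hgt_cm")  # assumed: study B already uses centimetres
    return None if value is None else float(value)


@dataclass
class StudyServer:
    """Stands in for one research centre that keeps its own records."""
    name: str
    records: List[Record]
    harmonize: Callable[[Record], Optional[float]]
    min_count: int = 3  # assumed threshold: refuse summaries of tiny groups

    def summary(self) -> Dict[str, float]:
        # Harmonize locally and return only aggregate values.
        values = [v for v in map(self.harmonize, self.records) if v is not None]
        if len(values) < self.min_count:
            raise ValueError(f"{self.name}: too few records to disclose a summary")
        return {"n": float(len(values)), "total": sum(values)}


def pooled_mean(servers: List[StudyServer]) -> float:
    """Combine per-study aggregates; individual records never leave a server."""
    summaries = [s.summary() for s in servers]
    n = sum(s["n"] for s in summaries)
    total = sum(s["total"] for s in summaries)
    return total / n


servers = [
    StudyServer("study_a", [{"height_m": 1.5}, {"height_m": 1.75},
                            {"height_m": 2.0}, {"height_m": None}],
                study_a_height),
    StudyServer("study_b", [{"hgt_cm": 160}, {"hgt_cm": 170}, {"hgt_cm": 180}],
                study_b_height),
]

print(pooled_mean(servers))
```

The design point the sketch tries to capture is that the analyst only ever sees counts and sums returned by each server, which mirrors the abstract's statement that each study retains full control over individual-level data.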

Similar resources

Developing a Theoretical and Operational Knowledge Audit Model for Project-Based Organizations

Background/aim. Considering the underlying role played by knowledge management in project-based organizations; and, the fact that knowledge audit is the most important step in supplying, maintaining and updating the content of knowledge management systems; this research effort is aimed at designing an appropriate knowledge audit model based on the requirements and factors of knowledge audit in ...

Road traffic noise, air pollution and cardio-respiratory health in European cohorts: a harmonised approach in the BioSHaRE project

Background and aims: Few studies have investigated joint effects of road traffic noise and air pollution on cardiovascular outcomes. This project aims to quantify the joint and separate effects of both exposures on prevalent and incident cardiovascular disease and asthma as part of the EU-funded BioSHaRE project involving five European cohorts (EPIC-Oxford, EPIC-Turin, HUNT, Lifelines, UK Bioba...

European Project on Osteoarthritis (EPOSA): methodological challenges in harmonization of existing data from five European population-based cohorts on aging

BACKGROUND The European Project on OSteoArthritis (EPOSA), here presented for the first time, is a collaborative study involving five European cohort studies on aging. This project focuses on the personal and societal burden and its determinants of osteoarthritis (OA). The aim of the current report is to describe the purpose of the project, the post harmonization of the cross-national data and ...

Designing Competency Model for Project Managers in Petroleum Industry

Due to rapid environmental changes and the large scope of project activities in the petroleum industry, project management is of particular importance. One of the essential and important factors for the success of projects is to focus on the behavioral characteristics, and in other words, the competencies of project managers. In this regard, designing a model of project managers' competencies i...

BiobankConnect: software to rapidly connect data elements for pooled analysis across biobanks using ontological and lexical indexing

OBJECTIVE Pooling data across biobanks is necessary to increase statistical power, reveal more subtle associations, and synergize the value of data sources. However, searching for desired data elements among the thousands of available elements and harmonizing differences in terminology, data collection, and structure, is arduous and time consuming. MATERIALS AND METHODS To speed up biobank da...

Journal title:

Volume 10, Issue

Pages -

Publication date: 2013